On the Challenges and Perspectives of Foundation Models for Medical Image Analysis
This article discusses the opportunities, applications and future directions
of large-scale pre-trained models, i.e., foundation models, for analyzing
medical images. Medical foundation models have immense potential in solving a
wide range of downstream tasks, as they can help to accelerate the development
of accurate and robust models, reduce the amount of labeled data required,
and preserve the privacy and confidentiality of patient data. Specifically,
we illustrate the "spectrum" of medical foundation models, ranging from general
vision models, modality-specific models, to organ/task-specific models,
highlighting their challenges, opportunities and applications. We also discuss
how foundation models can be leveraged in downstream medical tasks to enhance
the accuracy and efficiency of medical image analysis, leading to more precise
diagnosis and treatment decisions.
Multispectral Deep Neural Networks for Pedestrian Detection
Multispectral pedestrian detection is essential for around-the-clock
applications, e.g., surveillance and autonomous driving. We analyze
Faster R-CNN in depth for the multispectral pedestrian detection task and then
model it as a convolutional network (ConvNet) fusion problem. Further, we discover that
ConvNet-based pedestrian detectors trained by color or thermal images
separately provide complementary information in discriminating human instances.
Thus there is a large potential to improve pedestrian detection by using color
and thermal images in DNNs simultaneously. We carefully design four ConvNet
fusion architectures that integrate two-branch ConvNets at different DNN
stages, all of which yield better performance compared with the baseline
detector. Our experimental results on the KAIST pedestrian benchmark show that
the Halfway Fusion model, which performs fusion on the middle-level
convolutional features, outperforms the baseline method by 11% and yields a
miss rate 3.5% lower than the other proposed architectures.
Comment: 13 pages, 8 figures, BMVC 2016 ora
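The idea of fusing two branches at a middle layer can be illustrated with a minimal sketch. The abstract does not say which fusion operator is used, so the code below assumes channel concatenation followed by a 1x1 convolution that mixes the channels; the function names, the feature-map layout `[channel][row][col]`, and the weights are all hypothetical and chosen only for illustration.

```python
def concat_channels(color_feat, thermal_feat):
    """Stack two feature maps along the channel axis.

    Each map is a nested list shaped [channel][row][col]; both must share
    the same spatial size.
    """
    if len(color_feat[0]) != len(thermal_feat[0]) or \
            len(color_feat[0][0]) != len(thermal_feat[0][0]):
        raise ValueError("feature maps must have the same height and width")
    return color_feat + thermal_feat


def conv1x1(feat, weights):
    """Apply a 1x1 convolution: each output channel is a weighted sum of
    the input channels at the same pixel. weights is [out_ch][in_ch]."""
    height, width = len(feat[0]), len(feat[0][0])
    return [
        [
            [sum(w_row[c] * feat[c][y][x] for c in range(len(feat)))
             for x in range(width)]
            for y in range(height)
        ]
        for w_row in weights
    ]


def halfway_fuse(color_feat, thermal_feat, weights):
    """Assumed fusion step: concatenate mid-level features, then mix them."""
    return conv1x1(concat_channels(color_feat, thermal_feat), weights)


# Toy mid-level features: one channel each, 1x2 spatial grid.
color = [[[1.0, 2.0]]]
thermal = [[[3.0, 4.0]]]
# One output channel that averages the color and thermal channels.
fused = halfway_fuse(color, thermal, weights=[[0.5, 0.5]])
print(fused)
```

In a real detector the two inputs would be the outputs of the color and thermal branches at the same depth, and the mixing weights would be learned rather than fixed.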
Predicting Fracture Energies and Crack-Tip Fields of Soft Tough Materials
Soft materials including elastomers and gels are pervasive in biological
systems and technological applications. Whereas it is known that intrinsic
fracture energies of soft materials are relatively low, how the intrinsic
fracture energy cooperates with mechanical dissipation in the process zone to give
high fracture toughness of soft materials is not well understood. In addition,
it is still challenging to predict fracture energies and crack-tip strain
fields of soft tough materials. Here, we report a scaling theory that accounts
for synergistic effects of intrinsic fracture energies and dissipation on the
toughening of soft materials. We then develop a coupled cohesive-zone and
Mullins-effect model capable of quantitatively predicting fracture energies of
soft tough materials and strain fields around crack tips in soft materials
under large deformation. The theory and model are quantitatively validated by
experiments on fracture of soft tough materials under large deformations. We
further provide a general toughening diagram that can guide the design of new
soft tough materials.
Comment: 22 pages, 5 figure
High-performance, flexible thermoelectric generator based on bulk materials
the Centers for Mechanical Engineering Research and Education at MIT and SUSTec
Semi-supervised Pathological Image Segmentation via Cross Distillation of Multiple Attentions
Segmentation of pathological images is a crucial step for accurate cancer
diagnosis. However, acquiring dense annotations of such images for training is
labor-intensive and time-consuming. To address this issue, Semi-Supervised
Learning (SSL) has the potential for reducing the annotation cost, but it is
challenged by a large number of unlabeled training images. In this paper, we
propose a novel SSL method based on Cross Distillation of Multiple Attentions
(CDMA) to effectively leverage unlabeled images. Firstly, we propose a
Multi-attention Tri-branch Network (MTNet) that consists of an encoder and a
three-branch decoder, with each branch using a different attention mechanism
that calibrates features in different aspects to generate diverse outputs.
Secondly, we introduce Cross Decoder Knowledge Distillation (CDKD) between the
three decoder branches, allowing them to learn from each other's soft labels to
mitigate the negative impact of incorrect pseudo labels in training.
Additionally, uncertainty minimization is applied to the average prediction of
the three branches, which further regularizes predictions on unlabeled images
and encourages inter-branch consistency. Our proposed CDMA was compared with
eight state-of-the-art SSL methods on the public DigestPath dataset, and the
experimental results showed that our method outperforms the other approaches
under different annotation ratios. The code is available at
https://github.com/HiLab-git/CDMA.
Comment: Provisional Accepted by MICCAI 202
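The two unlabeled-data terms described in this abstract can be sketched for a single pixel. This is not the authors' implementation: it assumes the distillation term is a KL divergence between temperature-softened branch predictions, averaged over ordered branch pairs, and that uncertainty minimization means the entropy of the averaged prediction. The function names and the temperature value are hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax with an optional temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def cross_distillation_loss(branch_logits, temperature=2.0):
    """Assumed distillation term: each branch is pulled toward the soft
    labels of every other branch, averaged over all ordered pairs."""
    soft = [softmax(logits, temperature) for logits in branch_logits]
    pairs = [(i, j) for i in range(len(soft))
             for j in range(len(soft)) if i != j]
    return sum(kl_divergence(soft[j], soft[i]) for i, j in pairs) / len(pairs)


def entropy_of_average(branch_logits, eps=1e-12):
    """Assumed uncertainty term: entropy of the mean branch prediction."""
    probs = [softmax(logits) for logits in branch_logits]
    n_classes = len(probs[0])
    mean = [sum(p[k] for p in probs) / len(probs) for k in range(n_classes)]
    return -sum(m * math.log(m + eps) for m in mean)


# Toy logits for one pixel from three decoder branches (two classes).
agreeing = [[2.0, 0.0], [2.0, 0.0], [2.0, 0.0]]
disagreeing = [[2.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
print(cross_distillation_loss(agreeing), cross_distillation_loss(disagreeing))
print(entropy_of_average(agreeing), entropy_of_average(disagreeing))
```

Under these assumptions both terms are small when the branches agree confidently and grow when they disagree, which is the behavior the abstract attributes to inter-branch consistency.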
Pathology-and-genomics Multimodal Transformer for Survival Outcome Prediction
Survival outcome assessment is challenging and inherently associated with
multiple clinical factors (e.g., imaging and genomics biomarkers) in cancer.
Enabling multimodal analytics promises to reveal novel predictive patterns of
patient outcomes. In this study, we propose a multimodal transformer
(PathOmics) integrating pathology and genomics insights into colon-related
cancer survival prediction. We emphasize the unsupervised pretraining to
capture the intrinsic interaction between tissue microenvironments in gigapixel
whole slide images (WSIs) and a wide range of genomics data (e.g.,
mRNA-sequence, copy number variant, and methylation). After the multimodal
knowledge aggregation in pretraining, our task-specific model finetuning could
expand the scope of data utility applicable to both multi- and single-modal
data (e.g., image- or genomics-only). We evaluate our approach on both TCGA
colon and rectum cancer cohorts, showing that the proposed approach is
competitive and outperforms state-of-the-art studies. Finally, our approach is
well suited to using a limited number of samples for finetuning, enabling
data-efficient analytics for survival outcome prediction. The code is available
at https://github.com/Cassie07/PathOmics.
Comment: Accepted to MICCAI2023 (Top14%
- …